MULTIPLE LOCAL WHITTLE ESTIMATION
IN STATIONARY SYSTEMS

By P. M. Robinson

London School of Economics

Moving from univariate to bivariate jointly dependent long-memory
time series introduces a phase parameter ($\gamma$), at the frequency of
principal interest, zero; for short-memory series $\gamma = 0$ automatically.
The latter case has also been stressed under long memory, along with the
"fractional differencing" case $\gamma = (\delta_2 - \delta_1)\pi/2$, where $\delta_1$, $\delta_2$ are the
memory parameters of the two series. We develop time domain conditions
under which these are and are not relevant, and relate the
consequent properties of cross-autocovariances to ones of the (possibly
bilateral) moving average representation which, with martingale
difference innovations of arbitrary dimension, is used in asymptotic
theory for local Whittle parameter estimates depending on a single
smoothing number. Incorporating also a regression parameter ($\beta$)
which, when nonzero, indicates cointegration, the consistency proof
of these implicitly defined estimates is nonstandard due to the $\beta$
estimate converging faster than the others. We also establish joint
asymptotic normality of the estimates, and indicate how this outcome can
apply in statistical inference on several questions of interest. Issues
of implementation are discussed, along with implications of knowing $\beta$
and of correct or incorrect specification of $\gamma$, and possible extensions
to higher-dimensional systems and nonstationary series.

1. Introduction. In the analysis of long-memory time series, two major 
issues emerge in multivariate extension of univariate results. One is the
possibility of cointegration, whereby one or more linear combinations of the
(stationary or nonstationary) observables have reduced memory. In general, rules
of large sample inference based on a no-cointegration assumption are invali- 
dated by cointegration, and vice versa. The literature on cointegration under 
long memory is dwarfed by that under autoregressive (AR) unit roots, but
has been developed in several directions recently. Another distinctive
multivariate feature, which has attracted very little attention, is phase, essentially
the argument in polar co-ordinate representation of the cross-spectrum. This 
is a particularly interesting issue in a "semiparametric" setting, where the 
spectral density matrix is modeled only near zero frequency. For a jointly co- 
variance stationary short-memory process, this matrix is continuous at zero 
frequency; thus, since the quadrature spectrum (the imaginary part of the 
cross-spectrum) is an odd function, it, and thus the phase, are zero there. 
In long-memory series, on the other hand, where spectra diverge at zero 
frequency, the cross-spectrum is discontinuous there, and the phase need 
not be zero. In the literature, essentially two values for the phase have been 
considered, albeit rather implicitly, with little discussion of implications. 

The present paper develops large sample statistical inference, in a possi- 
bly cointegrated system, with unknown phase. The formal results focus on a 
bivariate system, extension of our techniques for establishing asymptotic sta- 
tistical theory to a system of arbitrary dimension being seemingly relatively 
straightforward, albeit introducing issues of specification and implementa- 
tion, whose detailed treatment would be lengthy; we include a brief discus- 
sion. We also focus on covariance stationary observable series. This becomes 
a theoretical possibility when we switch from an AR unit root cointegra- 
tion setting to a fractional one, and it has been of recent practical interest 
in financial time series analysis. We include, however, a brief discussion of 
possible nonstationary extensions. 

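Consider a bivariate jointly covariance stationary process $u_t = (u_{1t}, u_{2t})'$,
having spectral density matrix $f_u(\lambda)$ that satisfies

(1.1)  $f_u(\lambda) \sim \Phi(\lambda;\alpha_0)^{-1}\,\Omega_0\,\overline{\Phi(\lambda;\alpha_0)}^{-1}$  as $\lambda \to 0$,

(1.2)  $\Phi(\lambda;\alpha) = \operatorname{diag}\{|\lambda|^{\delta_1},\ |\lambda|^{\delta_2} e^{-i\,\operatorname{sign}(\lambda)\gamma}\}$,  $\lambda \in (-\pi,0)\cup(0,\pi]$.

Here, $\alpha = (\gamma, \delta')'$ for $\delta = (\delta_1, \delta_2)'$, where $\gamma$, $\delta_1$ and $\delta_2$ are real-valued, $\gamma_0$ and
$\delta_0 = (\delta_{01}, \delta_{02})'$ in $\alpha_0 = (\gamma_0, \delta_0')'$ are unknown, $\delta_{0i} \in [0, \frac{1}{2})$, $i = 1, 2$, $\Omega_0$ is an
unknown $2 \times 2$ positive definite matrix, and the overbar indicates complex
conjugation; the notation "$\sim$" in (1.1) means that for each element, the
ratio of real/imaginary parts of the left and right sides tends to 1 (taking
0/0 = 1).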

From (1.1), $u_{it}$ is said to have memory (parameter) $\delta_{0i}$, its spectral density
$f_i(\lambda)$ satisfying

$f_i(\lambda) \sim \omega_{ii}|\lambda|^{-2\delta_{0i}}$  as $\lambda \to 0$,  $i = 1, 2$,

where $\omega_{ij}$ is the $(i,j)$th element of $\Omega_0$. We deduce also that $u_{1t}, u_{2t}$ have
cross-spectrum $f_{12}(\lambda)$ [the top right element of $f_u(\lambda)$] satisfying

(1.3)  $f_{12}(\lambda) \sim \omega_{12}|\lambda|^{-\chi_0} e^{-i\,\operatorname{sign}(\lambda)\gamma_0}$  as $\lambda \to 0$,

where $\chi_0 = \delta_{02} + \delta_{01}$. Then (see, e.g., [4], page 302, [12], page 48) $\gamma_0$ is
the phase between $u_{1t}, u_{2t}$ at $\lambda = 0$. There is no loss of generality in the
restriction $\gamma_0 \in (-\pi, \pi]$. Thus the local approximation on the right of (1.3)
is real-valued only if $\omega_{12} = 0$ and/or

(1.4)  $\gamma_0 = 0$.
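
As a numerical illustration of (1.1)-(1.3), the following minimal sketch evaluates the local approximation $\Phi^{-1}\Omega_0\overline{\Phi}^{-1}$ at one frequency and reads the phase off its top right element. The function name `local_spectrum` and the parameter values are hypothetical choices made for illustration only; they do not come from the paper.

```python
import cmath
import math

def local_spectrum(lam, gamma, delta, omega):
    """Entries of Phi^{-1} Omega conj(Phi)^{-1} in (1.1)-(1.2), for lam != 0."""
    d1, d2 = delta
    sgn = math.copysign(1.0, lam)
    # Diagonal entries of Phi(lam; alpha)^{-1}.
    p = (abs(lam) ** (-d1),
         abs(lam) ** (-d2) * cmath.exp(1j * sgn * gamma))
    # Phi is diagonal, so entry (i, j) is p_i * omega_ij * conj(p_j).
    return [[p[i] * omega[i][j] * p[j].conjugate() for j in range(2)]
            for i in range(2)]

omega = [[1.0, 0.6], [0.6, 2.0]]          # hypothetical Omega_0
f = local_spectrum(0.05, 0.3, (0.1, 0.4), omega)
phase = -cmath.phase(f[0][1])             # equals gamma_0 = 0.3 for lam > 0
```

The top right element has modulus $\omega_{12}|\lambda|^{-\chi_0}$ and argument $-\operatorname{sign}(\lambda)\gamma_0$, so evaluating at $-\lambda$ flips the sign of the recovered phase.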

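To deduce another leading possibility, which applies to an extension of
the fractional ARMA class, a general model for $f_u(\lambda)$ is

(1.5)  $f_u(\lambda) = \Upsilon(\lambda;\alpha_0)^{-1} f^*(\lambda)\,\overline{\Upsilon(\lambda;\alpha_0)}^{-1}$,  $\lambda \in (-\pi,0)\cup(0,\pi]$,

where $\Upsilon(\lambda;\alpha) = \operatorname{diag}\{v(\lambda)^{\delta_1},\ v(\lambda)^{\delta_2}e^{-i\,\operatorname{sign}(\lambda)\gamma}\}$, $v(\lambda) = (1 - e^{i\lambda})e^{i\,\operatorname{sign}(\lambda)\pi/2}$
and $f^*(\lambda)$ is continuous and Hermitian positive definite at $\lambda = 0$. Since
$v(\lambda) \sim |\lambda|$ as $\lambda \to 0$, (1.1) holds. On the other hand, with $\nu_0 = \delta_{02} - \delta_{01}$,

(1.6)  $\gamma_0 = \frac{\pi}{2}\nu_0$

gives $\Upsilon(\lambda;\alpha_0) = \operatorname{diag}\{(1 - e^{i\lambda})^{\delta_{01}},\ (1 - e^{i\lambda})^{\delta_{02}}\}\,e^{i\,\operatorname{sign}(\lambda)\delta_{01}\pi/2}$, and since the
scalar factor has modulus 1, $u_t$ fractionally integrates an $I(0)$ process; if the
latter is ARMA, $u_t$ is fractional ARMA. [Note that (1.6) reduces to (1.4)
when $\delta_{01} = \delta_{02}$.] However, the fractional integration operator was originally
motivated in a parametric framework [1], and in a semiparametric one there
seems no overriding reason to fix $\gamma_0$. More generally, (1.1) with $\gamma_0 = (\delta_{02} - \delta_{01})c\pi/2$
can be shown to result from generalizing the fractional differencing
filter $1 - e^{i\lambda}$ to $(1 - e^{i|\lambda|^{1/c}\operatorname{sign}(\lambda)})^{c}$, $c \neq 0$.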

We can investigate the time domain implications of general $\gamma_0$. The proof
of the following theorem is left to Section 5. 

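Theorem 1. Denoting $r_{12}(j) = \operatorname{cov}(u_{1j}, u_{20})$, $j \in \mathbb{Z}$, assume $\chi_0 > 0$ and,
for $(\kappa_+, \kappa_-) \neq (0, 0)$,

(1.7)  $b_j = r_{12}(j) - \{\kappa_+ 1(j > 0) + \kappa_- 1(j < 0)\}|j|^{\chi_0 - 1}$

satisfies

(1.8)  $|b_j - b_{j+1}| \le K|b_j|/(|j| + 1)$,  $b_j = o(|j|^{\chi_0 - 1})$  as $|j| \to \infty$,

where $K$ throughout denotes an arbitrarily large positive generic constant.
Then (1.3) holds with

(1.9)  $\gamma_0 = \arctan\left\{\frac{\kappa_+ - \kappa_-}{\kappa_+ + \kappa_-}\tan\left(\frac{\pi}{2}\chi_0\right)\right\}$,  $\omega_{12} = (\kappa_+ + \kappa_-)\Gamma(\chi_0)\cos(\pi\chi_0/2)/(2\pi\cos\gamma_0)$.

In particular:

(1.10)  $\kappa_- = 0$ is equivalent to $\gamma_0 = \frac{\pi}{2}\chi_0$,  $\omega_{12} = \kappa_+\Gamma(\chi_0)/(2\pi)$,

(1.11)  $\kappa_+ = 0$ is equivalent to $\gamma_0 = -\frac{\pi}{2}\chi_0$,  $\omega_{12} = \kappa_-\Gamma(\chi_0)/(2\pi)$.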
Solving (1.9) gives $\kappa_\pm = 2\pi\omega_{12}\sin(\pi\chi_0/2 \pm \gamma_0)/\{\Gamma(\chi_0)\sin(\pi\chi_0)\}$. In view of (1.7) and
the second part of (1.8), $r_{12}(j)$ dominates $r_{12}(-j)$ as $j \to \infty$ in (1.10), and
vice versa in (1.11), while they decay at equal rates otherwise. The first
part of (1.8) implies, with (1.7), an analogous condition for $r_{12}(j)$, which is
satisfied by vector fractional ARMA processes. When $\kappa_+ = \kappa_-$ in (1.9), the
power-law approximation is symmetric in $j$, and (1.4) results. On the other
hand, (1.10) is a kind of weak causality ($u_2 \to u_1$) condition; it agrees with
(1.6) only if $\delta_{01} = 0$. In general, the theorem indicates that any value of $\gamma_0$
is a possibility.
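
The map (1.9) from the tail constants of the cross-autocovariance to the phase can be evaluated directly. The sketch below is illustrative only: the function name `phase_from_tails` is hypothetical, and it assumes $0 < \chi_0 < 1$ and $\kappa_+ + \kappa_- \neq 0$ so that the ratio in (1.9) is defined.

```python
import math

def phase_from_tails(kappa_plus, kappa_minus, chi0):
    """gamma_0 and omega_12 from (1.9); assumes 0 < chi0 < 1, kappa_plus + kappa_minus != 0."""
    ratio = (kappa_plus - kappa_minus) / (kappa_plus + kappa_minus)
    gamma0 = math.atan(ratio * math.tan(math.pi * chi0 / 2))
    omega12 = ((kappa_plus + kappa_minus) * math.gamma(chi0)
               * math.cos(math.pi * chi0 / 2)
               / (2 * math.pi * math.cos(gamma0)))
    return gamma0, omega12

# Symmetric tails give zero phase, as in (1.4).
gamma_sym, _ = phase_from_tails(1.0, 1.0, 0.5)
# A one-sided tail (kappa_minus = 0) gives gamma_0 = pi * chi0 / 2, as in (1.10).
gamma_one, omega_one = phase_from_tails(2.0, 0.0, 0.5)
```

With $\kappa_- = 0$ the second output reduces to $\kappa_+\Gamma(\chi_0)/(2\pi)$, matching (1.10).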

For the bivariate series $z_t = (y_t, x_t)'$, observed for $t = 1, \ldots, n$, consider the
system

(1.12)  $B_0 z_t = u_t$,  $t \in \mathbb{Z}$,  $B_0 = \begin{bmatrix} 1 & -\beta_0 \\ 0 & 1 \end{bmatrix}$,

with $\beta_0$ unknown, so $u_{1t}$ is unobservable. When $\delta_{01} \ge \delta_{02}$, $\beta_0$ cannot be
identified [from the spectral density matrix $f_z(\lambda)$ of $z_t$ near $\lambda = 0$] unless $\Omega_0$
is suitably restricted, for example, $\omega_{12}$ is known. When $\delta_{01} \neq \delta_{02}$, and $\beta_0 = 0$,
$y_t$ and $x_t$ have unequal memories $\delta_{01}, \delta_{02}$, respectively. When $\delta_{01} < \delta_{02}$ and
$\beta_0 \neq 0$, then both $x_t$ and $y_t$ have the same memory $\delta_{02}$, but the unobservable
linear combination $u_{1t} = y_t - \beta_0 x_t$ has less memory, $\delta_{01}$, and $x_t$ and $y_t$ are
said to be cointegrated. Both have a dominant common component with
memory $\delta_{02}$, and so a dimensionality reduction is achievable:

(1.13)  $f_z(\lambda) \sim (\beta_0, 1)'(\beta_0, 1)\,\omega_{22}|\lambda|^{-2\delta_{02}}$  as $\lambda \to 0$.

The right-hand side of (1.13) is singular, and the cointegrating error $u_{1t}$
has memory $\delta_{01}$. Included is the possibility that $\delta_{01} = 0$, when $u_{1t}$ has short
memory. We focus on estimating $\theta_0 = (\beta_0, \alpha_0')'$ under

(1.14)  $0 \le \delta_{01} < \delta_{02} < \frac{1}{2}$,

covering cointegrated systems ($\beta_0 \neq 0$), and, for $\delta_{01} < \delta_{02}$, noncointegrated
ones ($\beta_0 = 0$).

In [31] estimation of $\beta_0$ in (1.12) was discussed with $z_t$ exhibiting quite
general forms of nonstationarity, and $u_{1t}$ being stationary or nonstationary.
Reference [27] pointed out that cointegration is possible even when $z_t$ is
stationary with long memory, as might be true of certain financial time
series, say, and a number of references (e.g., [6, 23, 24]) have developed theory
and applications in this setting. Financial time series are often very long,
motivating reliance on only the "semiparametric," local, assumption (1.1).
This justifies methods with only slow convergence rates, but a very large $n$
compensates. Faster rates are available in parametric models, for example
when $u_t$ is a fractional ARMA process. However, if the ARMA component
is misspecified, in that either the autoregressive (AR) or moving average
(MA) orders are underspecified, or both are overspecified, all parameters
will be inconsistently estimated. In [5] estimation of cointegrating subspaces
in a semiparametric fractional context was studied. A recent parametric
reference is [19].

We consider a narrow-band or local Whittle estimate $\hat\theta = (\hat\beta, \hat\alpha')' = (\hat\beta, \hat\gamma, \hat\delta_1, \hat\delta_2)'$
extending that for scalar long-memory series of [20], whose asymptotic
properties were developed by [29], and further studied, and extended to
nonstationary or noncointegrated multivariate systems, by [18, 22, 26, 33, 34,
35]. References [36, 37] considered a version of it for cointegrated systems
but with nonstationary fractional observables, while [24] has alternative
results in the stationary case. We establish asymptotic properties of $\hat\theta$. For
estimates that are only implicitly defined, a central limit theorem (CLT) is
typically preceded by a consistency proof. This is more difficult to establish
than usual because $\hat\beta$ converges faster than $\hat\alpha$. Consistency is usually
established by showing that, after suitable normalization, the objective function
converges uniformly in the parameter space to a limit which identifies all
parameters and can thus be uniquely optimized. In multiparameter models
this approach only works when all parameter estimates converge at the same
rate. Additionally, as encountered by Robinson [29] in local Whittle estimation
of the memory of a scalar series, our consistency result is insufficient to
show that in the usual mean value theorem relations commencing the CLT
proof, points on line segments between $\hat\theta$ and $\theta_0$ can be replaced to negligible
effect by $\theta_0$; a slow convergence rate for $\hat\delta_1, \hat\delta_2$ is needed, and established
using the stronger moment condition in any case required for the CLT.

The following section describes $\hat\theta$. Section 3 presents regularity conditions,
a consistency result and CLT, and a small simulation study of finite-sample 
performance. Section 4 contains further discussion. Proofs are in Sections 
5-8. 

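2. Local Whittle estimation. For a generic vector $w_t$ define the periodogram
matrix $I_w(\lambda) = n^{-1}\left(\sum_{t=1}^{n} w_t e^{it\lambda}\right)\left(\sum_{t=1}^{n} w_t e^{-it\lambda}\right)'$. Define the Fourier
frequencies $\lambda_j = 2\pi j/n$, for integer $j$. In connection with (1.2) we allow some
choice of "working model" for $f_u(\lambda)$ near $\lambda = 0$. Introduce

$\Psi(\lambda;\alpha) = \operatorname{diag}\{\psi(\lambda)^{\delta_1},\ \psi(\lambda)^{\delta_2}e^{-i\,\operatorname{sign}(\lambda)\gamma}\}$,

for a given complex-valued function $\psi(\lambda)$ such that $\psi(-\lambda) = \overline{\psi(\lambda)}$ and

(2.1)  $\psi(\lambda) - |\lambda| = o(1)$  as $\lambda \to 0$.

For example, $\psi(\lambda) = |\lambda|$ or $v(\lambda)$. Defining $A(\lambda;\theta) = \Psi(\lambda;\alpha) B I_z(\lambda) B'\,\overline{\Psi(\lambda;\alpha)}$,
where $\theta = (\beta, \alpha')'$ and $B$ is defined as in (1.12) with $\beta_0$ replaced by $\beta$,
consider the objective function

$Q(\theta, \Omega) = \frac{1}{m}\sum_{j=1}^{m}\left[\log\det\{\Psi(\lambda_j;\alpha)^{-1}\Omega\,\overline{\Psi(\lambda_j;\alpha)}^{-1}\} + \operatorname{tr}\{A(\lambda_j;\theta)\Omega^{-1}\}\right]$,

for $\Omega \in S$, the set of real positive definite $2 \times 2$ matrices, and an integer
$m \in [1, n/2]$ which satisfies at least

(2.2)  $\frac{1}{m} + \frac{m}{n} \to 0$  as $n \to \infty$.

The real function $Q$ is minimized over $S$ by $\hat\Omega(\theta) = \operatorname{Re}\{m^{-1}\sum_{j=1}^{m} A(\lambda_j;\theta)\}$,
leading to

$R(\theta) = Q(\theta, \hat\Omega(\theta)) = \log\det\{\hat\Omega(\theta)\} - 2(\delta_1 + \delta_2)\frac{1}{m}\sum_{j=1}^{m}\log|\psi(\lambda_j)|$.

Thus estimate $\theta_0$ by $\hat\theta = \arg\min_{\theta \in \Theta} R(\theta)$, for a compact set $\Theta$ such that
$\Theta = \Theta_\beta \times \Theta_\gamma \times \Theta_\delta$, with $\Theta_\beta$, $\Theta_\gamma$, $\Theta_\delta$ chosen as follows. Take
$\Theta_\delta = \{\delta: -\eta_1 \le \delta_1 \le \delta_2 - \eta_2 \le \frac{1}{2} - \eta_2 - \eta_3\}$, where the $\eta_i$ are arbitrarily small positive
numbers satisfying $0 < \eta_1 < \min(\eta_2, \eta_3)$, $\eta_2 + \eta_3 < \frac{1}{2}$; our consistency proof
necessitates including a constraint corresponding to (1.14). We allow some $\delta_1 < 0$
because the CLT requires $\theta_0$ to be interior to $\Theta$, and we cover short memory,
$\delta_{01} = 0$. We choose $\Theta_\gamma = [\eta_4 - \pi/2,\ \pi/2 - \eta_4]$ for $\eta_4 \in (0, \eta_3 - \eta_1)$, so $\gamma_0 \in \Theta_\gamma$
under (1.4) and (1.6). We can take $\Theta_\beta$ to be an arbitrarily large interval,
possibly including {0}.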
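
To make the construction concrete, here is a minimal sketch of the concentrated objective for the working model $\psi(\lambda) = |\lambda|$, written in plain Python with a direct discrete Fourier transform. The function names `dft` and `concentrated_objective` and the toy data are hypothetical, and any additive constant coming from the trace term is omitted because it does not affect the minimizer. It is not an optimized or definitive implementation.

```python
import cmath
import math
import random

def dft(series, lam):
    """sum_t w_t exp(i t lam) for t = 1, ..., n."""
    return sum(w * cmath.exp(1j * t * lam) for t, w in enumerate(series, start=1))

def concentrated_objective(theta, y, x, m):
    """Concentrated objective with psi(lambda) = |lambda|; theta = (beta, gamma, delta1, delta2)."""
    beta, gamma, d1, d2 = theta
    n = len(y)
    u1 = [yt - beta * xt for yt, xt in zip(y, x)]    # first element of B z_t
    s11 = s22 = s12 = mean_log = 0.0
    for j in range(1, m + 1):
        lam = 2 * math.pi * j / n
        w1, w2 = dft(u1, lam), dft(x, lam)
        # Real parts of the entries of Psi * I(lam) * conj(Psi).
        s11 += lam ** (2 * d1) * abs(w1) ** 2 / n
        s22 += lam ** (2 * d2) * abs(w2) ** 2 / n
        s12 += lam ** (d1 + d2) * (cmath.exp(1j * gamma) * w1 * w2.conjugate()).real / n
        mean_log += math.log(lam)
    s11, s22, s12, mean_log = s11 / m, s22 / m, s12 / m, mean_log / m
    det = s11 * s22 - s12 ** 2                       # det of the real 2x2 Omega-hat
    return math.log(det) - 2 * (d1 + d2) * mean_log

rng = random.Random(0)                               # reproducible toy data, not from the paper
x = [rng.gauss(0.0, 1.0) for _ in range(64)]
y = [xt + 0.5 * rng.gauss(0.0, 1.0) for xt in x]
r_value = concentrated_objective((1.0, 0.0, 0.1, 0.3), y, x, m=8)
```

One property visible in this sketch: with $\gamma = 0$ and $\delta_1 = \delta_2$ the determinant, and hence the objective, does not depend on $\beta$, which mirrors the identification problem noted after (1.12) when the two memories coincide. In practice the function would be passed to a numerical minimizer over the compact set $\Theta$.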

3. Asymptotic and finite-sample properties. Existence of $f_u(\lambda)$ implies
that for $p \ge 2$ we can find a $2 \times p$ matrix-valued function $C(\lambda)$ such that
$C(-\lambda) = \overline{C(\lambda)}$ and

(3.1)  $f_u(\lambda) = C(\lambda)\overline{C(\lambda)}'$,  $\lambda \in (-\pi, \pi]$.

The representation (3.1) is familiar in case $p = 2$, but it is then obviously
available for $p > 2$. Even when $p = 2$, $C(\lambda)$ is defined only up to
postmultiplication by a unitary matrix, and when $p > 2$ the ambiguity is greater.
From [12], page 61, existence of $f_u(\lambda)$ is equivalent to $u_t$ having
representation

(3.2)  $u_t = E u_t + \sum_{j=-\infty}^{\infty} C_j \varepsilon_{t-j}$,  $t \in \mathbb{Z}$,  $\sum_{j=-\infty}^{\infty}\|C_j\|^2 < \infty$,

where $\{\varepsilon_t\}$ is a $p \times 1$ vector process such that $E\varepsilon_t = 0$, $E\varepsilon_t\varepsilon_t' = I_p$ (the $p \times p$
identity matrix), $E\varepsilon_s\varepsilon_t' = 0$, $s \neq t$, $s, t \in \mathbb{Z}$, $C_j = (2\pi)^{-1/2}\int_{-\pi}^{\pi} C(\lambda)e^{-ij\lambda}\,d\lambda$,
and $\|\cdot\|$ is Euclidean norm. We will have to strengthen the conditions on $\varepsilon_t$
for asymptotic theory, but first discuss two other features of (3.2).

Moving average (MA) representations of long-memory time series models
have typically been one-sided; in particular $C_j = 0$, all $j < 0$, in (3.2),
implying $u_t$ is purely nondeterministic (see, e.g., [11]). (An exception is [8], which
considers a parametric model.) With Assumption A2, and the stronger
Assumption B2 below for central limit theory, a one-sided representation was
assumed in [29] in asymptotic theory for local Whittle estimation of the memory
parameter, and subsequently by a number of authors in extensions
of this work. On the other hand, since the basic quantity modeled
is the spectral density matrix, rather than the process itself, there is no 
essential reason to impose one-sidedness. Indeed, going back to the earlier 
literature one can find repeated examples of bilateral representations in time 
series asymptotics (e.g., [2, 12, 25]). More recently, such representations have 
been employed to model specific (non-Gaussian, short-memory) phenomena 
(see, e.g., [3, 21], as well as examples in the electrical engineering literature, 
say). Our main motivation for allowing a bilateral representation here is to 
indicate its ability to yield any phase under long memory. 

Theorem 2. Let (3.2) hold with $\{\varepsilon_t\}$ satisfying the conditions that follow
it, and, denoting the $(k,\ell)$th element of $C_j$ by $c_{jk\ell}$, let

$g_{jk\ell} = c_{jk\ell} - \{\xi_{+k\ell}1(j > 0) + \xi_{-k\ell}1(j < 0)\}|j|^{\delta_{0k} - 1}$

satisfy

$|g_{jk\ell} - g_{j+1,k\ell}| \le K|g_{jk\ell}|/(|j| + 1)$,  $g_{jk\ell} = o(|j|^{\delta_{0k} - 1})$  as $|j| \to \infty$,

for constants $\xi_{+k\ell}$, $\xi_{-k\ell}$, $k = 1, 2$ and $\ell = 1, \ldots, p$. Then (1.7) and (1.8) of
Theorem 1 hold with

$\kappa_+ = \xi_{+1}'\xi_{+2}B(1 - \chi_0, \delta_{02}) + \xi_{+1}'\xi_{-2}B(\delta_{01}, \delta_{02}) + \xi_{-1}'\xi_{-2}B(1 - \chi_0, \delta_{01})$,
$\kappa_- = \xi_{+1}'\xi_{+2}B(1 - \chi_0, \delta_{01}) + \xi_{-1}'\xi_{+2}B(\delta_{02}, \delta_{01}) + \xi_{-1}'\xi_{-2}B(1 - \chi_0, \delta_{02})$,

where $\xi_{+k}$, $\xi_{-k}$ are $p \times 1$ vectors with $\ell$th elements $\xi_{+k\ell}$, $\xi_{-k\ell}$, respectively.

Section 6 contains a proof sketch. When $\xi_{-1} = \xi_{-2} = 0$, so that $u_t$ is purely
nondeterministic, the relation $\Gamma(x)\Gamma(1 - x) = \pi\csc(\pi x)$ and trigonometric
addition formulae may be shown to give (1.6), to extend the known results
for fractional ARMA models. On the other hand, [6, 22, 23, 24] consider
purely nondeterministic long-memory vector sequences with zero phases,
(1.4), and we do not know of $C_j$ satisfying this prescription. However, the
power-law decay of MA coefficients is only a sufficient condition for
power-law spectral behavior. When $\xi_{+1} = \xi_{+2} = 0$, so $u_t$ has a one-sided forward
representation, then $\gamma_0 = -\nu_0\pi/2$, the negative of (1.6), and the theorem
indicates that for bilateral models $\gamma_0$ can take any value, which depends on
the $\xi_{\pm k}$ as well as the $\delta_{0i}$.
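
The constants $\kappa_\pm$ of Theorem 2 are simple to evaluate once the vectors $\xi_{\pm k}$ are given. The following sketch is illustrative: it assumes that $B(\cdot,\cdot)$ denotes the Euler Beta function and that $\delta_{01}, \delta_{02} > 0$ with $\chi_0 < 1$, and the function names `beta_fn`, `dot` and `tail_constants` are hypothetical.

```python
import math

def beta_fn(a, b):
    """Euler Beta function B(a, b) for a, b > 0, via Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def tail_constants(xi_p1, xi_p2, xi_m1, xi_m2, d01, d02):
    """kappa_+ and kappa_- from the vectors xi_{+k}, xi_{-k} (k = 1, 2)."""
    chi0 = d01 + d02
    k_plus = (dot(xi_p1, xi_p2) * beta_fn(1 - chi0, d02)
              + dot(xi_p1, xi_m2) * beta_fn(d01, d02)
              + dot(xi_m1, xi_m2) * beta_fn(1 - chi0, d01))
    k_minus = (dot(xi_p1, xi_p2) * beta_fn(1 - chi0, d01)
               + dot(xi_m1, xi_p2) * beta_fn(d02, d01)
               + dot(xi_m1, xi_m2) * beta_fn(1 - chi0, d02))
    return k_plus, k_minus

# Hypothetical weights with p = 3 shocks, identical on both sides of the filter.
a, b = [1.0, 0.5, -0.2], [0.3, 1.0, 0.4]
k_plus, k_minus = tail_constants(a, b, a, b, d01=0.1, d02=0.4)
```

When the forward and backward weights coincide, as in this usage example, the two constants are equal, so the power-law approximation in (1.7) is symmetric in $j$.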

Another difference from the earlier references where MA representations
are used in asymptotic theory for local Whittle estimates is in the allowance
for rectangular, not necessarily square, $C_j$ in (3.2), and thus $u_t$ generated
by shocks of higher dimension than the bivariate observable. Note that the
equivalence property mentioned when introducing (3.2) is lost when $\varepsilon_t$
satisfies stronger assumptions, as in Assumption A2 below, but some generality
can be recouped by the allowance for $p > 2$. This is natural if $x_t$, $y_t$ are seen
as just two of a vector of related observations that are analyzed pairwise. It
is also natural if (1.12) is viewed as a consequence of component models for
$x_t$, $y_t$, namely $x_t = a_t + b_t$, $y_t = \beta_0 a_t + c_t$, where $a_t, b_t, c_t$ are unobservable
sequences such that $a_t$ has memory $\delta_{02}$ and $u_{1t} = c_t - \beta_0 b_t$ has memory $\delta_{01}$; if
the memories of $b_t$ and $c_t$ differ, then $b$ in Assumptions B1, B3 and B5 below
is restricted. We can allow $(a_t, b_t, c_t)$ to have a nonsingular spectral density
matrix by choosing $p \ge 3$ in (3.1). Note that $x_t$ and $y_t$ might themselves
be instantaneous nonlinear functions of raw series $X_t$, $Y_t$, where $Y_t$ and $X_t$
are nonlinearly related, for example (in view of evidence of stationary long
memory and cointegration in nonlinear functions of financial time series, see,
e.g., [6]), logged squares, with $X_t$, $Y_t$ generated by long-memory stochastic
volatility models, $X_t = A_t B_t$, $Y_t = A_t^{\beta_0} C_t$, where $A_t = e^{a_t}$, $B_t = e^{b_t}$, $C_t = e^{c_t}$.
We introduce the following assumptions for our consistency result.

Assumption A1. Property (1.1) holds, where $u_t$ is covariance stationary,
and for $C(\lambda)$ in (3.1),

(3.3)  $\Phi(\lambda;\alpha_0)C(\lambda) - P = o(1)$  as $\lambda \to 0+$,

where the real $2 \times p$ matrix $P$ satisfies $PP' = \Omega_0$, and $C(\lambda)$ is differentiable
in a neighborhood of $\lambda = 0$, satisfying there

(3.4)  $\Phi(\lambda;\alpha_0)\frac{d}{d\lambda}C(\lambda) = O(\lambda^{-1})$  as $\lambda \to 0+$.

Assumption A2. $\{\varepsilon_t\}$ in (3.2) satisfy also $E(\varepsilon_t | \mathcal{F}_{t-1}) = E(\varepsilon_t)$,
$E(\varepsilon_t\varepsilon_t' | \mathcal{F}_{t-1}) = E(\varepsilon_t\varepsilon_t')$, a.s., $t \in \mathbb{Z}$, where $\mathcal{F}_t$ is the $\sigma$-field of events generated by
$\varepsilon_s$, $s \le t$, and also $P(\varepsilon_t'\varepsilon_t > \eta) \le KP(X > \eta)$ for all $\eta > 0$ for some scalar
nonnegative random variable $X$ such that $EX < \infty$.

Assumption A3. Property (2.1) holds.

Assumption A4. $\theta_0 \in \Theta$.

Assumption A5. Property (2.2) holds.

Assumption A6.

(3.5)  $0 < |\omega_{12}| < (\omega_{11}\omega_{22})^{1/2}$.

Assumption A6 on the one hand implies $\Omega_0$ is positive definite, and on
the other rules out

(3.6)  $\omega_{12} = 0$,

when $u_{1t}, u_{2t}$ are incoherent at $\lambda = 0$ [cf. (1.3)]. Under (3.6) $\gamma_0$ is
unidentifiable. We subsequently discuss related problems in which $\gamma_0$ is known
and (3.6) is permitted. It could be covered in our theorems with extra
detail, but while (3.6) is milder than the time domain orthogonality condition
$r_{12}(j) = 0$, $j \in \mathbb{Z}$, it is less usual in the cointegration setting than (3.5), which
tends to treat observables as jointly dependent. Assumption A1 implies (1.1),
and this and other conditions are natural extensions or modifications of ones
in [22, 29, 33].

Theorem 3. Let Assumptions A1-A6 hold. Then

$\hat\alpha \to_p \alpha_0$,  $\hat\beta = \beta_0 + o_p\left((m/n)^{\nu_0}\right)$  as $n \to \infty$.

To prove asymptotic normality we introduce the following assumptions.

Assumption B1. Assumption A1 holds, with the right-hand side of
(3.3) replaced by $O(\lambda^b)$, for some $b \in (0, 2]$.

Assumption B2. Assumption A2 holds, with also the elements of $\varepsilon_t$
having a.s. constant third and fourth moments and cross-moments,
conditional on $\mathcal{F}_{t-1}$.

Assumption B3. Property (2.1) holds for all $\gamma \in \Theta_\gamma$, after replacing its
right-hand side by $O(\lambda^b)$, $b \in (0, 2]$.

Assumption B4. $\theta_0$ is an interior point of $\Theta$.

Assumption B5. For any $C < \infty$

(3.7)  $\frac{(\log m)^2 m^{1+2b}}{n^{2b}} + \frac{(\log n)^C}{m} \to 0$  as $n \to \infty$.

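The extensions of the previous conditions are similar to ones in earlier
literature, the requirement $(\log n)^C/m \to 0$ coping, as in [33], with the fact that
$\log n$ terms are not eliminated at the outset when $\psi(\lambda) = |\lambda|$. Define by $\Sigma$
the symmetric $4 \times 4$ matrix with $(k,\ell)$th element $\sigma_{k\ell}$, given by
$\sigma_{11} = 2\mu\{(1 - 2\nu_0)^{-1} - (1 - \nu_0)^{-2}\cos^2(\gamma_0)\}\omega_{22}/\omega_{11}$,
$\sigma_{12} = -2\mu(1 - \nu_0)^{-1}\sin(\gamma_0)(\omega_{12}/\omega_{11})$,
$\sigma_{13} = 2\mu\nu_0(1 - \nu_0)^{-2}\cos(\gamma_0)\omega_{12}/\omega_{11}$,
$\sigma_{14} = -2\mu\nu_0(1 - \nu_0)^{-2}\cos(\gamma_0)\omega_{22}/\omega_{11}$,
$\sigma_{22} = -\sigma_{34} = 2\mu\rho^2$, $\sigma_{23} = \sigma_{24} = 0$, $\sigma_{33} = \sigma_{44} = 4 - \sigma_{34}$,
where $\mu = (1 - \rho^2)^{-1}$, $\rho = \omega_{12}/(\omega_{11}\omega_{22})^{1/2}$. Write $\Delta_n = \operatorname{diag}\{\lambda_m^{-\nu_0}, 1, 1, 1\}$ and
let $N_k$ denote a $k$-variate normal variate.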
Table 1

Frequency of Wald test rejections (%), nominal 5% level

                                 n = 128                      n = 512                      n = 2048
$\delta_{01}$  $\delta_{02}$  $\rho$      m    $\beta$     $\gamma$     $\delta_1$      m    $\beta$     $\gamma$     $\delta_1$      m    $\beta$     $\gamma$     $\delta_1$
0.05   0.45   0.75     13   95.0   8.6   18.3     32   99.6   8.4   25.0     81  100     6.1   38.9
0.05   0.45   0.75     25   93.5   6.0   59.0     64   99.9   5.5   76.5    161  100     3.5   83.9
0.05   0.45   0.75     51   69.8   6.0   99.5    128   96.3   4.5  100      323  100     6.4  100
0.05   0.45   0.9      13   97.3   5.5   18.1     32   99.8   6.3   32.9     81  100     5.0   52.9
0.05   0.45   0.9      25   96.4   4.1   61.5     64   99.9   3.5   82.7    161  100     4.2   93.2
0.05   0.45   0.9      51   84.1   2.3   98.9    128   99.9   4.4  100      323  100    11.0  100
0.2    0.3    0.75     13   92.5  16.8   40.6     32   94.4  21.7   66.8     81   95.6  15.2   94.9
0.2    0.3    0.75     25   89.7  12.0   88.6     64   92.6  17.0   99.1    161   98.0  12.9  100
0.2    0.3    0.75     51   90.6   4.9  100      128   93.3   7.9  100      323   99.6  11.0  100
0.2    0.3    0.9      13   91.9  15.7   41.7     32   93.3  16.1   73.1     81   98.7  12.0   97.3
0.2    0.3    0.9      25   88.8  10.5   91.5     64   95.8  12.8   99.8    161   99.8   9.0  100
0.2    0.3    0.9      51   91.0   5.5  100      128   98.0   7.8  100      323  100     6.1  100



Theorem 4. Let Assumptions B1-B5 and A6 hold. Then as $n \to \infty$

$m^{1/2}\Delta_n(\hat\theta - \theta_0) \to_d N_4(0, \Sigma^{-1})$.

A consistent estimate $\hat\Sigma$ of $\Sigma$ is formed by plugging in $\hat\theta$ in place of $\theta_0$
and elements of $\hat\Omega(\hat\theta)$ for those of $\Omega_0$. After also replacing $\Delta_n$ by
$\hat\Delta_n = \operatorname{diag}\{\lambda_m^{-\hat\nu}, 1, 1, 1\}$, we can form asymptotically valid confidence regions for
$\theta_0$, and also test hypotheses of interest, such as the linear homogeneous
restrictions $\beta_0 = 0$ "no-cointegration"; (1.4) "zero-phase"; (1.6) "purely
nondeterministic"; $\gamma_0 = (\delta_{01} + \delta_{02})\pi/2$ "weak causality"; $\delta_{01} = 0$ "short-memory
cointegrating error." A small Monte Carlo study of finite-sample performance
was carried out along such lines. To satisfy (1.1), $u_t$ was generated
from the fractional ARMA $\operatorname{diag}\{(1 - L)^{\delta_{01}}, (1 - L)^{\delta_{02}}\}(1 - 0.5L)u_t = R^{1/2}\varepsilon_t$,
where $L$ is the lag operator, the $\varepsilon_t$ are bivariate normal, and $R$ has elements 1
and 4 down the main diagonal and off-diagonal element $2\rho$. Thus $\gamma_0 = \nu_0\pi/2$
(and $\omega_{12} = 4\rho/\pi$). We took $\delta_0 = (0.05, 0.45)'$ and $(0.2, 0.3)'$, $\rho = 0.75$ and
0.9, $\beta_0 = 1$. On each of 1000 replications, $\hat\theta$ was computed for three values
of $m$, approximately $n^{2/3}/2$, $n^{2/3}$ and $2n^{2/3}$ (see Table 1), in each of three
sample sizes, $n = 128$, 512 and 2048. We employed $\psi(\lambda) = |\lambda|$ (so local
misspecification was incurred), and $\eta_1 = 0.01$, $\eta_2 = \eta_3 = 0.02$, $\eta_4 = 0.005$,
$\Theta_\beta = [-3, 3]$. Table 1 gives Wald test rejection frequencies, at nominal
two-sided 5% level, for the hypotheses $\beta_0 = 0$ (under "$\beta$"), (1.6) (under "$\gamma$")
and $\delta_{01} = 0$ (under "$\delta_1$").

The second hypothesis is true so that size is measured, while the others
are false so that power is measured. When $\delta_0 = (0.2, 0.3)'$ the gap is very
small (and hard to detect); here the test on $\gamma_0$ is clearly oversized, even
for large $n$, though matters improve for large $m$, and for $\delta_0 = (0.05, 0.45)'$
the sizes are better on average, albeit variable. For the test on $\delta_{01}$ power is
poor for the smallest $m$, especially but unsurprisingly when $\delta_{01} = 0.05$, but
increases satisfactorily with both. Power for testing $\beta_0$ is mostly very high.
Overall, it seems hard to draw firm conclusions about the effect of $\rho$, while a
relatively large $m$ appears to work best. Our technical results can be readily
adapted to justify score and pseudo-likelihood-ratio-type tests.
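
The bandwidths listed in Table 1 can be reproduced from the sample sizes. The sketch below assumes that $m$ is $c\,n^{2/3}$ for $c = 1/2, 1, 2$, rounded to the nearest integer; the rounding rule and the function name `bandwidth_grid` are assumptions made here for illustration, chosen because they match the tabulated values.

```python
def bandwidth_grid(n, scales=(0.5, 1.0, 2.0), exponent=2.0 / 3.0):
    """Candidate bandwidths m = round(c * n**exponent) for each scale c."""
    return [round(c * n ** exponent) for c in scales]

grids = {n: bandwidth_grid(n) for n in (128, 512, 2048)}
```

Here `grids[128]` holds the three bandwidths used with $n = 128$, and similarly for the other sample sizes.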

4. Discussion. 

Remark 1. Lack of block-diagonality in $\Sigma$ suggests that correctly fixing
$\alpha$ in $R(\theta)$ or employing an estimate $\tilde\alpha$ which converges faster than
$m^{1/2}$ gives an estimate, $\hat\beta(\alpha)$, say, that is more efficient than $\hat\beta$, satisfying
$m^{1/2}\lambda_m^{-\nu_0}\{\hat\beta(\alpha_0) - \beta_0\} \to_d N_1(0, \sigma_{11}^{-1})$. Going even further, but assuming
(1.6), [15] provided an even more precise estimate of $\beta_0$, having the same
efficiency as one minimizing $Q(\theta, \Omega)$ after replacing $\alpha$ and $\Omega$ by known $\alpha_0$
and $\Omega_0$; this estimate has also the advantage of a closed form representation.
However, the need to select more than one bandwidth number, and in other
respects suitably design the estimate of $\alpha_0$, and possibly $\Omega_0$, presents some
disadvantage.

Remark 2. On the other hand, computationally simpler but less efficient
estimates than $\hat\beta$ are available. Reference [27] suggested the narrow-band
least squares estimate

(4.1)  $\tilde\beta = \left\{\sum_{j=1}^{m} I_{xx}(\lambda_j)\right\}^{-1}\sum_{j=1}^{m}\operatorname{Re}\{I_{yx}(\lambda_j)\}$,

where $(I_{yx}(\lambda), I_{xx}(\lambda))'$ makes up the second column of $I_z(\lambda)$, and showed it
to be consistent under very similar conditions to some of those for Theorem
1; [32] showed it is $(n/m)^{\nu_0}$-consistent (cf. Theorem 1). It advantageously
avoids estimating $\alpha_0$. Reference [6] showed $\tilde\beta$ to be $(n/m)^{\nu_0}m^{1/2}$-consistent
and asymptotically normal under (3.6) and $\chi_0 < 1/2$; [23] gave analogous
results for a weighted version of (4.1). Even when a CLT for $\tilde\beta$, or another
simple estimate, is available, the limiting variance depends on $\alpha_0$. Under
(3.5), [32] showed that $(n/m)^{\nu_0}(\tilde\beta - \beta_0)$ converges in probability to a nonzero
constant, so no useful inferential result is available. Our $\hat\beta$ corrects the bias.
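
A narrow-band least squares estimate of the kind described in this remark regresses $y_t$ on $x_t$ using only the first $m$ Fourier frequencies. The sketch below takes the ratio of the summed real part of the cross-periodogram to the summed periodogram of $x_t$; this form, the function names `dft` and `narrow_band_ls`, and the toy data are assumptions made for illustration.

```python
import cmath
import math
import random

def dft(series, lam):
    """sum_t w_t exp(i t lam) for t = 1, ..., n."""
    return sum(w * cmath.exp(1j * t * lam) for t, w in enumerate(series, start=1))

def narrow_band_ls(y, x, m):
    """Ratio of summed Re{I_yx} to summed I_xx over Fourier frequencies j = 1..m."""
    n = len(y)
    num = den = 0.0
    for j in range(1, m + 1):
        lam = 2 * math.pi * j / n
        wy, wx = dft(y, lam), dft(x, lam)
        num += (wy * wx.conjugate()).real   # n * Re{I_yx(lam_j)}
        den += abs(wx) ** 2                 # n * I_xx(lam_j)
    return num / den                        # the common factor n cancels

rng = random.Random(1)                      # reproducible toy data, not from the paper
x = [rng.gauss(0.0, 1.0) for _ in range(64)]
y_exact = [2.0 * xt for xt in x]
beta_tilde = narrow_band_ls(y_exact, x, m=8)
```

Because the zero frequency is excluded, adding a constant to either series leaves the estimate essentially unchanged, and in the noise-free usage example the slope 2 is recovered exactly up to rounding.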

Remark 3. Simpler estimates of other parameters are available. We
can estimate $\delta_{01}$ and $\delta_{02}$ using univariate local Whittle (see, e.g., [20, 29]),
bivariate log-periodogram [28] or bivariate local Whittle [22, 33] techniques,
though such estimation of $\delta_{01}$ requires a preliminary estimate of $\beta_0$. Given
a preliminary estimate $\tilde\beta$, a simple estimate of $\gamma_0$ is



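$\tilde\gamma = \arctan\left[\operatorname{Im}\left\{\sum_{j=1}^{m} s(\lambda_j)\right\}\Big/\operatorname{Re}\left\{\sum_{j=1}^{m} s(\lambda_j)\right\}\right]$,

where $s(\lambda) = I_{yx}(\lambda) - \tilde\beta I_{xx}(\lambda)$.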



Remark 4. When $\gamma_0 = 0$, $\Sigma$ is block-diagonal with respect to $\beta$, $\delta$ on
the one hand and $\gamma$ on the other. Treating $\gamma_0$ as an unknown parameter
seems unique in a long-memory setting, and it is worth noting the effects of
its prior misspecification. Suppose we fix $\gamma = \gamma^*$ in $R(\theta)$, and then minimize
with respect to $\beta$, $\delta$. Denoting $\theta_0^* = (\beta_0, \gamma^*, \delta_{01}, \delta_{02})'$, arguments like those in
the proofs of Theorems 1 and 2 give

$\hat\Omega(\theta_0^*) \to_p \begin{bmatrix} \omega_{11} & \omega_{12}\cos(\gamma^* - \gamma_0) \\ \omega_{12}\cos(\gamma^* - \gamma_0) & \omega_{22} \end{bmatrix}$.

Likewise, taking $a \sim_p b$ to mean $a/b \to_p 1$ element-wise, calculations in the
proof of Theorem 4 give

$\frac{\partial\hat\Omega(\theta_0^*)}{\partial\beta} \sim_p -\frac{\lambda_m^{-\nu_0}}{1 - \nu_0}\begin{bmatrix} 2\omega_{12}\cos(\gamma_0) & \omega_{22}\cos(\gamma^*) \\ \omega_{22}\cos(\gamma^*) & 0 \end{bmatrix}$.

Thus from (8.3) in the proof of Theorem 4 below,

$\frac{\partial R(\theta_0^*)}{\partial\beta} \sim_p -\frac{2\lambda_m^{-\nu_0}}{1 - \nu_0}\cdot\frac{\omega_{12}\omega_{22}\sin(\gamma^* - \gamma_0)\sin(\gamma^*)}{\omega_{11}\omega_{22} - \omega_{12}^2\cos^2(\gamma^* - \gamma_0)}$.

It is readily seen that $(\partial/\partial\delta_k)R(\theta_0^*) \to_p 0$, but due to the nondiagonal limiting
structure of $(\partial^2/\partial\theta\,\partial\theta')R(\theta_0^*)$, it appears that unless $\gamma^* = \gamma_0$, or $\gamma^* = 0$,
not only is the $\beta_0$ estimate only $(n/m)^{\nu_0}$-consistent but the $\delta_{0i}$ estimates are
inconsistent. When $\gamma_0 \neq \gamma^* = 0$, these estimates are asymptotically normal
but their limiting variance matrix is complicated, and depends on $\gamma_0$. Our
discussion suggests a more serious cost to incorrectly fixing $\gamma^* \neq 0$, for
example, when $\gamma$ is replaced by $\nu\pi/2$ in $Q(\theta, \Omega)$, where $\nu = \delta_2 - \delta_1$; see (1.6).
However, it can also be inferred that such bias problems are absent under
(3.6). There are two cases of potential interest. In one, (3.6) is assumed a
priori, in the other it is not; in both $\gamma_0$ is specified. In both cases the estimates
of $\beta_0$, $\delta_{01}$, $\delta_{02}$, after correct centering and normalization as in Theorem 4,
converge to independent zero-mean normal variates, whose variances can be
deduced from the formulae in $\Sigma$ in the latter case (which the CLT of [24]
addresses).



Remark 5. On the other hand, if $\beta_0$ is known (e.g., to be zero, where
there is no cointegration) we can infer from Theorems 3 and 4 that after
correct centering and $m^{1/2}$ normalization, the estimates of $\gamma_0$ and $\delta_0$ are
asymptotically independent, with limiting variances given in the inverse of
the matrix consisting of the last three rows and columns of $\Sigma$. In fact the
consistency proof is much simpler than that of Theorem 3, and the results
hold for $|\delta_{0i}| < \frac{1}{2}$, $i = 1, 2$, with $\Theta_\delta$ chosen suitably.

Remark 6. Also in the known noncointegrated case $\beta_0 = 0$, consistent
estimation of $\gamma_0$, as well as of $\delta_0$, is relevant in inference based on the sample
mean $\bar z = (z_1 + \cdots + z_n)/n$. Under our conditions it may be shown that as
$n \to \infty$

$\operatorname{diag}\{n^{1/2 - \delta_{01}}, n^{1/2 - \delta_{02}}\}(\bar z - Ez_1) \to_d N_2\left(0, \left(\frac{2\pi\omega_{ij}\cos((i - j)\gamma_0)}{\Gamma(\delta_{0i} + \delta_{0j} + 2)\cos(\pi(\delta_{0i} + \delta_{0j})/2)}\right)\right)$,

where the $(i,j)$th element of the $2 \times 2$ variance matrix is indicated. In [30]
inference was developed in which the $\omega_{ij}$ and $\delta_0$ are replaced by consistent
estimates [better than $\log n$-consistent in case of $\delta_0$, for which (3.7) suffices]
but assuming $\gamma_0$ satisfies (1.6). If this assumption is incorrect, a corresponding
confidence ellipse would be inconsistent. This kind of issue does not arise
under short memory $\delta_{01} = \delta_{02} = 0$, where the variance matrix is $2\pi f_u(0)$, and
phase is bound to be zero.

Remark 7. An earlier version of this paper employed a different phase
parameterization, $\phi = \gamma/(\delta_2 - \delta_1)$, in place of $\gamma$. This naturally covers (1.4) ($\phi_0 = 0$) and
(1.6) ($\phi_0 = \pi/2$), but is less natural in general, in view of Theorems 1 and
2. It affects the form of $\Sigma$, in particular giving nonzero $\sigma_{23}$ and $\sigma_{24}$. As a
consequence, when $\beta_0$ is known the limiting variance matrix for estimation
of $\alpha_0$ is no longer block-diagonal (cf. Remark 5), while if $\phi$ is incorrectly
specified to a nonzero value (e.g., $\pi/2$), $\delta_0$ is estimated inconsistently; in
Remark 4, with $\gamma$ likewise misspecified, this was due to estimating $\beta_0$. On
the other hand, with the $\phi\nu$ parameterization, [33] compared the cases when
$\phi$ is correctly fixed at $\pi/2$, and when $\phi$ is correctly fixed at 0 (where the
limit distribution is the same as in Remark 5), finding greater precision in
the former.

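Remark 8. To construct approximate Newton iterations, given an $i$th
iterate $\theta^{(i)}$, $i \ge 1$, we can form $\Sigma^{(i)}$ by plugging in $\theta^{(i)}$ for $\theta_0$ in $\Sigma$,
replacing elements of $\Omega_0$ by those of $\hat\Omega(\theta^{(i)})$, and then compute
$\theta^{(i+1)} = \theta^{(i)} - (\Delta_n^{(i)}\Sigma^{(i)}\Delta_n^{(i)})^{-1}(\partial/\partial\theta)R(\theta^{(i)})$, where $\Delta_n^{(i)}$ is $\Delta_n$ evaluated
at $\theta^{(i)}$. Choices for $\theta^{(1)}$ include estimates described in Remarks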
2 and 3. If $\theta^{(1)}$ satisfies $m^{1/2}\Delta_n(\theta^{(1)} - \theta_0) = O_p(1)$, then $\theta^{(2)}$ has the properties
of $\hat\theta$ in Theorem 4. If the initial $\beta_0$ estimate is only $(n/m)^{\nu_0}$-consistent,
as is (4.1), $\theta^{(i)}$ should satisfy Theorem 4 for some finite $i$ but determination
of a minimal $i$ depends on hypothesizing a rate of increase for $m$ with $n$,
and on the unknown $\nu_0$. If a smaller $m$ is used in (4.1) than in $R(\theta)$, assuming
the former $m$ increases sufficiently slowly relative to the latter one can
justify $i = 2$ even.

Remark 9. With respect to choice of $m$, minimizing approximate mean
squared error (MSE) of a given linear combination of elements is complicated,
especially as $\hat\beta$ converges faster than $\hat\alpha$. Though suboptimal, the
minimum MSE rule (in scalar local Whittle estimation of memory) of [13] could
be applied, most simply to the $x_t$ sequence (requiring preliminary estimation
of $\delta_{02}$ and fixing $b$ in Assumption B1, say, to 2). As always, a minimum-MSE
rate violates the assumption (here B5) that provides correct centering in the
CLT, suggesting use of a smaller $m$. In univariate local Whittle memory
estimation, with data tapering, Giraitis and Robinson [10] developed an $m$
that minimizes the error in the CLT, having rate $n^{b/(1+b)}$, which satisfies
B5; with $b = 2$ this is the rate employed in the Monte Carlo. References [16]
and [17] proposed data-dependent $m$ in univariate log-periodogram memory
estimation. Full confidence cannot be placed in any automatic technique, and
it may be wise to employ a grid of $m$ values and assess sensitivity; estimates
for a given $m$ should be a good starting point for iterations with adjacent
$m$.
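The grid-of-$m$ suggestion in Remark 9 can be sketched as follows. This is only an illustration: the function name `bandwidth_grid`, the multiplicative `factors` and the example exponent are assumptions made here, not choices taken from the paper, and the cap at $n/2$ is a convenience.

```python
def bandwidth_grid(n, exponent, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Return sorted candidate bandwidths m = round(c * n**exponent).

    `exponent` and `factors` are illustrative tuning choices, not values
    prescribed by the paper; each m is kept in the range [1, n // 2].
    """
    base = n ** exponent
    candidates = {min(max(int(round(c * base)), 1), n // 2) for c in factors}
    return sorted(candidates)


# Hypothetical usage: n = 512 observations, exponent 2/3 chosen for illustration.
for m in bandwidth_grid(512, 2.0 / 3.0):
    print(m)  # re-estimate with each m and compare the resulting estimates
```

Estimates obtained at one grid point can then serve as starting values at the adjacent one, as the remark recommends.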

Remark 10. From Assumption B5, $\hat\alpha$, $\hat\beta$ converge slower than $n^{b/(1+2b)}$,
$n^{1/2-(1/2-\nu_0)/(1+2b)}$, respectively, for example, $n^{2/5}$, $n^{(2+\nu_0)/5}$ for $b = 2$, while
for all $b$ the rate of $\hat\beta$ approaches $n^{1/2}$ as $\nu_0 \to 1/2$. This rate is best for
estimates of all parameters if $f_u(\lambda)$, $\lambda \in (-\pi, \pi]$, is parametric (extending theory
of [7, 9, 11, 14]). But misspecification of $f_u$ incurs inconsistent estimation
of all parameters, and if $f_u$ involves additional parameters [over those in
(1.1)], computational burden increases. The least squares estimate of $\beta_0$ is
inconsistent when $u_{1t}$ and $u_{2t}$ are correlated (cf. Assumption A6).
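The rate exponents quoted in Remark 10 can be checked with exact arithmetic. The sketch below assumes the bounds have the form $n^{b/(1+2b)}$ and $n^{1/2-(1/2-\nu_0)/(1+2b)}$; the function names and the value $\nu_0 = 1/4$ are illustrative only.

```python
from fractions import Fraction


def alpha_rate_exponent(b):
    """Exponent of n in the bound n^{b/(1+2b)} for the memory/phase estimates."""
    b = Fraction(b)
    return b / (1 + 2 * b)


def beta_rate_exponent(b, nu0):
    """Exponent of n in the bound n^{1/2 - (1/2 - nu0)/(1+2b)} for the beta estimate."""
    b, nu0 = Fraction(b), Fraction(nu0)
    return Fraction(1, 2) - (Fraction(1, 2) - nu0) / (1 + 2 * b)


nu0 = Fraction(1, 4)  # illustrative value of the memory gap
print(alpha_rate_exponent(2))      # 2/5
print(beta_rate_exponent(2, nu0))  # 9/20, which equals (2 + nu0)/5
```

As $\nu_0$ approaches $1/2$ the second exponent approaches $1/2$ for every $b$, matching the statement in the remark.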

Remark 11. By analogy with the pseudospectrum of univariate nonstationary
fractional series, we can define a pseudospectral density matrix
[involving a phase parameter as in (1.1)] for vector series with one or more
nonstationary elements. Integer differencing of both series will not change
phase, and may produce the stationary setting of the present paper. Given
uncertainty as to whether or not the data are nonstationary, or about the
degree of nonstationarity, alternative methods, already employed to extend
univariate local Whittle estimates (e.g., [34, 35]), should produce analogous
asymptotic properties to those in Theorems 3 and 4, albeit perhaps with
some variance inflation, so long as the gap between memory parameters is
less than $1/2$ [as in (1.14)]. If this gap exceeds $1/2$, optimal estimates have a
faster rate, and mixed normal asymptotics [15]. Reference [36] considered
local Whittle estimation with a gap exceeding $1/2$, but the estimate of $\beta_0$
achieves a slower convergence rate than is attainable even by such simple
estimates as (4.1) and least squares when also the sum of memory parameters
exceeds 1.

Remark 12. Another kind of extension concerns multivariate series $z_t$
of dimension $q > 2$. Reference [24] considers local Whittle estimation with
$q > 2$ and a single cointegrating relation, though with phases correctly
assumed to be zero, (3.6) assumed in the CLT, and a consistency proof which,
like ours, takes $q = 2$. More generally, $q > 2$ raises the possibility that
the number, $r < q$, of cointegrating relations exceeds 1. In (1.12), $B_0$ can
be redefined by replacing the 1's in the diagonal by blocks $I_r$ and $I_{q-r}$,
with $\beta_0$ now being an $r \times (q - r)$ matrix. Likewise in (1.1), (1.2) the
dimension is extended to $q$, with, for $j \in [2,q]$, the $j$th diagonal element of
$\Phi(\lambda;\alpha)$ now being $|\lambda|^{\delta_j}e^{-i\,\mathrm{sign}(\lambda)(\gamma_1+\cdots+\gamma_{j-1})}$, with $\delta_i < \delta_j$ for $i \le r$, $j > r$.
Thus $\alpha = (\gamma_1, \ldots, \gamma_{q-1}, \delta_1, \ldots, \delta_q)'$ unless, to mitigate possible curse of
dimensionality and additional computational challenge, prior restrictions are
imposed, for example, $\delta_1 = \cdots = \delta_r$ and/or $\delta_{r+1} = \cdots = \delta_q$. Such constraints
could imply some zero $\gamma_j$ even under fractional integration assumptions [cf.
(1.6), which is zero for $\delta_{01} = \delta_{02}$], but in general they can be unrestricted.
Prior restrictions on $\beta_0$ might also be imposed. Our methods can be
straightforwardly extended to estimate the remaining, unknown, parameters. The
techniques of proof of Theorems 3 and 4 also appear to extend, while
Theorems 1 and 2 clearly remain relevant.

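5. Proof of Theorem 1. From [38], page 186,

(5.1)  $\sum_{j=1}^{\infty} j^{\chi_0-1}e^{ij\lambda} = \Gamma(\chi_0)e^{i\pi\chi_0/2}\lambda^{-\chi_0} + O(1)$  as $\lambda \to 0+$.

For $\lambda \ne 0$, mod$(2\pi)$,

$f_{12}(\lambda) = \frac{1}{2\pi}\Big\{\gamma_{12}(0) + K_+\sum_{j=1}^{\infty}j^{\chi_0-1}e^{-ij\lambda} + K_-\sum_{j=1}^{\infty}j^{\chi_0-1}e^{ij\lambda} + \sum_{|j|=1}^{\infty}b_je^{-ij\lambda}\Big\}$.

The last term in braces is bounded by

$\sum_{j=1}^{N}(|b_j| + |b_{-j}|) + \sum_{j=N+1}^{\infty}\{|b_j - b_{j+1}| + |b_{-j} - b_{-j-1}|\}\Big|\sum_{k=N}^{j}e^{ik\lambda}\Big|$
$\le K\epsilon(N^{\chi_0} + N^{\chi_0-1}|\lambda|^{-1}) = o(|\lambda|^{-\chi_0})$  as $\lambda \to 0$,

where $\epsilon > 0$ is arbitrary and we choose $N \sim |\lambda|^{-1}$. Thus from (5.1),

$f_{12}(\lambda) \sim (2\pi)^{-1}(K_+e^{-i\,\mathrm{sign}(\lambda)\pi\chi_0/2} + K_-e^{i\,\mathrm{sign}(\lambda)\pi\chi_0/2})\Gamma(\chi_0)|\lambda|^{-\chi_0}$  as $\lambda \to 0$.

Then (1.9) is determined by inspection, and the remaining statements are
straightforwardly verified.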

6. Proof of Theorem 2. Take $j > 0$. With $c_{ij}'$ denoting the $i$th row of $C_j$,
write

$\gamma_{12}(j) = \sum_{i=j+1}^{\infty}c_{1i}'c_{2,i-j} + \sum_{i=0}^{j}c_{1i}'c_{2,i-j} + \sum_{i=-\infty}^{-1}c_{1i}'c_{2,i-j}$.

Each of the three terms on the right-hand side is dominated by contributions
in which $c_{1i}$, $c_{2,i-j}$ are of order $|i|^{\delta_{01}-1}$ and $|i-j|^{\delta_{02}-1}$, respectively,
the remainder terms involving products of these with the $g_{1i}$, $g_{2,i-j}$ and
products of the latter. After integral approximation of the leading terms we
write

(6.1)  $\gamma_{12}(j) = \Big\{d_{1+}'d_{2+}\int_0^{\infty}(1+x)^{\delta_{01}-1}x^{\delta_{02}-1}\,dx + d_{1+}'d_{2-}\int_0^1 x^{\delta_{01}-1}(1-x)^{\delta_{02}-1}\,dx$
       $\qquad + d_{1-}'d_{2-}\int_0^{\infty}x^{\delta_{01}-1}(1+x)^{\delta_{02}-1}\,dx\Big\}j^{\chi_0-1} + b_j$.

We omit the straightforward but lengthy proof that $b_j$ satisfies (1.7) and
(1.8). It only remains to express the integrals in (6.1) as Beta functions.
The method of proof for $j < 0$ is identical.
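The final step of this proof, expressing the integrals in (6.1) as Beta functions, can be sketched numerically. The code assumes the three integrals have the forms $\int_0^\infty (1+x)^{\delta_{01}-1}x^{\delta_{02}-1}dx$, $\int_0^1 x^{\delta_{01}-1}(1-x)^{\delta_{02}-1}dx$ and $\int_0^\infty x^{\delta_{01}-1}(1+x)^{\delta_{02}-1}dx$ with $\delta_{01}, \delta_{02} > 0$ and $\delta_{01}+\delta_{02} < 1$, and uses the standard identity $\int_0^\infty x^{p-1}(1+x)^{-(p+q)}dx = B(p,q)$. The function names are hypothetical.

```python
import math


def beta_fn(p, q):
    """Euler Beta function B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def theorem2_integrals(d1, d2):
    """Closed forms of the three integrals for d1, d2 > 0 with d1 + d2 < 1."""
    if not (d1 > 0 and d2 > 0 and d1 + d2 < 1):
        raise ValueError("need d1 > 0, d2 > 0 and d1 + d2 < 1")
    first = beta_fn(d2, 1 - d1 - d2)   # int_0^inf (1+x)^(d1-1) x^(d2-1) dx
    second = beta_fn(d1, d2)           # int_0^1 x^(d1-1) (1-x)^(d2-1) dx
    third = beta_fn(d1, 1 - d1 - d2)   # int_0^inf x^(d1-1) (1+x)^(d2-1) dx
    return first, second, third


print(theorem2_integrals(0.2, 0.3))
```

When the two memory parameters coincide the first and third values agree, which is the symmetry one would expect from interchanging the roles of the two series.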

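7. Proof of Theorem 3. We first give the proof with "o" replaced by
"O" in the error bound for $\hat\beta$. For any $c > 0$ define neighborhoods $\mathcal{N}_\beta(c) =
\{\beta: |\beta - \beta_0| < c\}$, $\mathcal{N}_\gamma(c) = \{\gamma: |\gamma - \gamma_0| < c\}$, $\mathcal{N}_\delta(c) = \{\delta: \|\delta - \delta_0\| < c\}$. Fix
$\epsilon > 0$ and define $\mathcal{N}(\epsilon) = \mathcal{N}_\beta(\epsilon^{-1}(m/n)^{\nu_0}) \times \mathcal{N}_\gamma(\epsilon) \times \mathcal{N}_\delta(\epsilon)$, $\bar{\mathcal{N}}(\epsilon) = \Theta \setminus \mathcal{N}(\epsilon)$.
We have $P(\hat\theta \in \bar{\mathcal{N}}(\epsilon)) \le P(\inf_{\bar{\mathcal{N}}(\epsilon)}\{R(\theta) - R(\theta_0)\} \le 0)$. To show that this tends
to zero we first decompose $R(\theta) - R(\theta_0)$. We omit the straightforward proof,
using Assumption A3, that the effect of replacing $\Psi(\lambda;\alpha)$ by $\Phi(\lambda;\alpha)$, when
they differ, is negligible, uniformly on $\bar{\mathcal{N}}(\epsilon)$, and proceed as if $\Psi = \Phi$. Then

(7.1)  $R(\theta) - R(\theta_0) = \log\det\{\hat\Omega(\theta)\hat\Omega(\theta_0)^{-1}\} - 2\sum_{i=1}^{2}\zeta_i\,\frac{1}{m}\sum_j\log\lambda_j$,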

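where $\zeta_i = \delta_i - \delta_{0i}$ and $\sum_j$ means $\sum_{j=1}^{m}$. With $T(\delta) = \mathrm{diag}\{(2\zeta_1 + 1)^{1/2},
(2\zeta_2 + 1)^{1/2}\}$, $H(\delta) = \mathrm{diag}\{\lambda_m^{\zeta_1}, \lambda_m^{\zeta_2}\}$, $\hat\Omega^*(\theta) = H(\delta)^{-1}\hat\Omega(\theta)H(\delta)^{-1}$, write

$R(\theta) - R(\theta_0) = \log\det\{T(\delta)\hat\Omega^*(\theta)T(\delta)\hat\Omega(\theta_0)^{-1}\} + u(\delta)$,

where $u(\delta) = \sum_{i=1}^{2}\{2\zeta_i - \log(2\zeta_i + 1) + 2\zeta_i(\log m - m^{-1}\sum_j\log j - 1)\}$. To
decompose $\hat\Omega^*(\theta)$ denote $I_{uj} = I_u(\lambda_j)$, and deduce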

$BI_z(\lambda_j)B' = I_{uj} + (B - B_0)I_{uj} + I_{uj}(B - B_0)' + (B - B_0)I_{uj}(B - B_0)'$.

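With the definitions $H_j = \Psi(\lambda_j;\alpha_0)I_{uj}\bar\Psi(\lambda_j;\alpha_0)$, $b_n(\beta) = \lambda_m^{-\nu_0}(\beta_0 - \beta)$, $\tau =
\gamma - \gamma_0$, rearrangement gives

(7.2)  $\hat\Omega^*(\theta) = \hat G^{(1)}(\alpha) + b_n(\beta)\hat G^{(2)}(\alpha) + b_n^2(\beta)\hat G^{(3)}(\alpha)$,

where $\hat G^{(i)}(\alpha) = (\hat g^{(i)}_{k\ell})$, $\hat g^{(1)}_{kk} = m^{-1}\sum_j (j/m)^{2\zeta_k}h_{kkj}$ ($k = 1,2$), $\hat g^{(1)}_{12} = \hat g^{(1)}_{21} =
(2m)^{-1}\sum_j (j/m)^{\zeta_1+\zeta_2}(e^{i\tau}h_{12j} + e^{-i\tau}h_{21j})$, $\hat g^{(2)}_{11} = m^{-1}\sum_j (j/m)^{\zeta_1+\delta_1-\delta_{02}}(e^{-i\gamma_0}h_{12j} + e^{i\gamma_0}h_{21j})$,
$\hat g^{(2)}_{12} = \hat g^{(2)}_{21} = \cos\gamma\; m^{-1}\sum_j (j/m)^{\delta_1+\zeta_2-\delta_{02}}h_{22j}$, $\hat g^{(3)}_{11} = m^{-1}\sum_j (j/m)^{2(\delta_1-\delta_{02})}h_{22j}$,
$\hat g^{(2)}_{22} = \hat g^{(3)}_{12} = \hat g^{(3)}_{21} = \hat g^{(3)}_{22} = 0$, suppressing reference
to dependence on $\alpha$ in the $\hat g^{(i)}_{k\ell}$ and with $H_j = (h_{k\ell j})$. Defining

$U_\alpha(\alpha) = \log\det\{T(\delta)\hat G^{(1)}(\alpha)T(\delta)\hat G^{(1)}(\alpha_0)^{-1}\} + u(\delta)$,  $U_\beta(\theta) = \log\det\{\hat\Omega^*(\theta)\hat G^{(1)}(\alpha)^{-1}\}$,

we have $R(\theta) - R(\theta_0) = U_\alpha(\alpha) + U_\beta(\theta)$, since $\hat\Omega(\theta_0) = \hat G^{(1)}(\alpha_0)$. Writing
$\bar{\mathcal{N}}_\beta(c) = \Theta_\beta \setminus \mathcal{N}_\beta(c)$, $\bar{\mathcal{N}}_\gamma(c) = \Theta_\gamma \setminus \mathcal{N}_\gamma(c)$, $\bar{\mathcal{N}}_\delta(c) = \Theta_\delta \setminus \mathcal{N}_\delta(c)$, and also $\Theta_\alpha =
\Theta_\gamma \times \Theta_\delta$, $\bar{\mathcal{N}}_\alpha(c) = \{\bar{\mathcal{N}}_\gamma(c) \times \Theta_\delta\} \cup \{\Theta_\gamma \times \bar{\mathcal{N}}_\delta(c)\}$, it suffices to show that as
$n \to \infty$

(7.3)  $P\big(\inf_{\bar{\mathcal{N}}_\alpha(\epsilon)} U_\alpha(\alpha) \le 0\big) \to 0$,

(7.4)  $P\big(\inf_{\bar{\mathcal{N}}_\beta(\epsilon^{-1}(m/n)^{\nu_0}) \times \Theta_\alpha} U_\beta(\theta) \le 0\big) \to 0$.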

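Introduce the following population analogues of the $\hat g^{(i)}_{k\ell}$: $g^{(1)}_{kk} = \omega_{kk}(2\zeta_k +
1)^{-1}$ ($k = 1,2$), $g^{(1)}_{12} = g^{(1)}_{21} = (\zeta_1 + \zeta_2 + 1)^{-1}\omega_{12}\cos\tau$, $g^{(2)}_{11} = 2(\zeta_1 + \delta_1 - \delta_{02} +
1)^{-1}\omega_{12}\cos\gamma_0$, $g^{(2)}_{12} = g^{(2)}_{21} = (\delta_1 + \zeta_2 - \delta_{02} + 1)^{-1}\omega_{22}\cos\gamma$, $g^{(3)}_{11} = \{2(\delta_1 - \delta_{02}) +
1\}^{-1}\omega_{22}$, $g^{(2)}_{22} = g^{(3)}_{12} = g^{(3)}_{21} = g^{(3)}_{22} = 0$; write $G^{(i)}(\alpha) = (g^{(i)}_{k\ell})$.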

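To prove (7.3) observe that from the inequality $|\log(1 + x)| \le 2|x|$ for
$|x| \le 1/2$, and because $\bar{\mathcal{N}}_\alpha(\epsilon) \subset \{\bar{\mathcal{N}}_\gamma(\epsilon) \times \Theta_\delta\} \cup \{\Theta_\gamma \times \bar{\mathcal{N}}_\delta(\epsilon)\}$, it suffices (following a
development like that in [22]) to show

(7.5)  $\sup_{\Theta_\alpha}\|T(\delta)\{\hat G^{(1)}(\alpha) - G^{(1)}(\alpha)\}T(\delta)\| \to_p 0$,

(7.6)  $\sup_{\Theta_\alpha}\|\{T(\delta)G^{(1)}(\alpha)T(\delta)\}^{-1}\| < \infty$,

(7.7)  $\inf_{\bar{\mathcal{N}}_\gamma(\epsilon) \times \Theta_\delta}\log\det\{T(\delta)G^{(1)}(\alpha)T(\delta)G^{(1)}(\alpha_0)^{-1}\} > 0$,

(7.8)  $\liminf_{n\to\infty}\inf_{\bar{\mathcal{N}}_\delta(\epsilon)} u(\delta) > 0$.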

We omit the details of (7.5) as these are now standard, mainly following the
proof of Theorem 1 of [29], and multivariate extensions [22, 33]. Our model
(3.2) is more general than those in such references in two respects, namely
our allowance for a bilateral MA and for the dimension of $\varepsilon_t$ to exceed 2,
but it is readily seen that neither extension materially affects the proof. The
basic technique involves summation-by-parts (to deal with the uniformity)
followed by approximation of the $H_j$ by the $PI_{\varepsilon j}P'$, where $I_{\varepsilon j} = I_\varepsilon(\lambda_j)$ (see
[29]) and then approximating the consequent term by one in
$\Omega_0$ (with only a second moment for $\varepsilon_t$ required for the latter step due to
applying a law of large numbers for $L_1$ variables to the term in the $\varepsilon_t\varepsilon_t'$) and
approximating sums of form $m^{-1}\sum_j (j/m)^a$ by $(1 + a)^{-1}$ for $a > -1$. The
most significant difference from earlier results is the presence of the general
$\gamma$, $\gamma_0$, but this is easily handled in view of compactness of $\Theta_\gamma$. Likewise, (7.8)
follows from the proof of Theorem 1 of [29], which used the inequalities

(7.9)  $\inf_{|x| \ge \epsilon}\{x - \log(x + 1)\} \ge \epsilon^2/6$,  $\big|\log m - m^{-1}\sum_j\log j - 1\big| = O(\log m/m)$.
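The approximations invoked in this step are easy to check numerically. The sketch below (function names are illustrative, not from the paper) evaluates the Riemann-sum approximation $m^{-1}\sum_j (j/m)^a \approx (1+a)^{-1}$ for $a > -1$, the quantity $\log m - m^{-1}\sum_j \log j - 1$, which by Stirling's formula is $O(\log m/m)$, and the gap $x - \log(1+x)$ at $x = \pm\epsilon$.

```python
import math


def riemann_power_sum(m, a):
    """m^{-1} * sum_{j=1}^m (j/m)^a, which tends to 1/(1+a) for a > -1."""
    return sum((j / m) ** a for j in range(1, m + 1)) / m


def stirling_gap(m):
    """log m - m^{-1} * sum_{j=1}^m log j - 1, which tends to 0 as m grows."""
    return math.log(m) - sum(math.log(j) for j in range(1, m + 1)) / m - 1.0


def convexity_gap(x):
    """x - log(1 + x) for x > -1; nonnegative and zero only at x = 0."""
    return x - math.log1p(x)


print(riemann_power_sum(10000, 0.5), 1 / 1.5)
print(stirling_gap(1000), math.log(1000) / 1000)
for eps in (0.1, 0.5, 0.9):
    print(eps, min(convexity_gap(eps), convexity_gap(-eps)), eps ** 2 / 6)
```

Because $x - \log(1+x)$ increases with $|x|$ on either side of zero, comparing its values at $\pm\epsilon$ with $\epsilon^2/6$ is enough to check the first inequality for a given $\epsilon \in (0,1)$.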



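To prove (7.6) observe that

(7.10)  $\det\{T(\delta)G^{(1)}(\alpha)T(\delta)\} = \omega_{11}\omega_{22} - c(\delta)\omega_{12}^2\cos^2\tau$,

where $c(\delta) = (2\zeta_1 + 1)(2\zeta_2 + 1)/(\zeta_1 + \zeta_2 + 1)^2$. It follows from the
inequality $0 \le 4xy \le (x + y)^2$, for $x, y \ge 0$, that $0 < c(\delta) \le 1$, and thus (7.10)
$\ge \det(\Omega_0) > 0$.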
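The bound on $c(\delta)$ can be illustrated numerically. The sketch assumes, as the population analogues $g^{(1)}_{k\ell}$ imply, that the determinant in (7.10) equals $\omega_{11}\omega_{22} - c(\delta)\omega_{12}^2\cos^2\tau$; the function names and the numerical values of the $\omega_{k\ell}$ are hypothetical.

```python
import math


def c_delta(z1, z2):
    """c(delta) = (2*z1 + 1)(2*z2 + 1) / (z1 + z2 + 1)^2 for z1, z2 > -1/2."""
    return (2 * z1 + 1) * (2 * z2 + 1) / (z1 + z2 + 1) ** 2


def det_scaled_g1(w11, w22, w12, z1, z2, tau):
    """Assumed form of det{T G^(1) T}: w11*w22 - c(delta) * w12^2 * cos^2(tau)."""
    return w11 * w22 - c_delta(z1, z2) * w12 ** 2 * math.cos(tau) ** 2


w11, w22, w12 = 2.0, 3.0, 1.5          # illustrative positive definite Omega_0
det_omega0 = w11 * w22 - w12 ** 2
for z1 in (-0.4, 0.0, 0.3):
    for z2 in (-0.2, 0.3, 2.0):
        print(z1, z2, c_delta(z1, z2), det_scaled_g1(w11, w22, w12, z1, z2, 0.7) >= det_omega0)
```

Setting $x = 2\zeta_1 + 1$ and $y = 2\zeta_2 + 1$ gives $c(\delta) = 4xy/(x+y)^2$, so equality $c(\delta) = 1$ holds exactly when $\zeta_1 = \zeta_2$.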

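To prove (7.7) note that

(7.11)  $\log\det\{T(\delta)G^{(1)}(\alpha)T(\delta)G^{(1)}(\alpha_0)^{-1}\} = \log\Big\{\frac{\omega_{11}\omega_{22} - c(\delta)\omega_{12}^2\cos^2\tau}{\omega_{11}\omega_{22} - \omega_{12}^2}\Big\}$.

From $|\cos\tau| \le 1$, $|c(\delta)| \le 1$ and $\log(1 + x) \ge x/(1 + x)$ for $x > 0$, this is
lower-bounded by $\rho^2\{1 - c(\delta)\cos^2\tau\} \ge \rho^2\sin^2\tau$, where $\rho^2 = \omega_{12}^2/(\omega_{11}\omega_{22})$. Because $\sin(\pi - x) = \sin x$,

(7.12)  $\inf_{\bar{\mathcal{N}}_\gamma(\epsilon) \times \Theta_\delta}\sin^2\tau \ge \min\{\sin^2(\epsilon/2), \sin^2(\pi/4)\} > 0$.

Since $\rho \ne 0$, (7.7) is proved.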

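Now consider (7.4). We can write $U_\beta(\theta) = \log\hat Q(b_n(\beta))$, where $\hat Q(s) = 1 +
\hat a_1s + \hat a_2s^2$, $\hat a_1 = (\hat g^{(2)}_{11}\hat g^{(1)}_{22} - 2\hat g^{(1)}_{12}\hat g^{(2)}_{12})/\det\{\hat G^{(1)}(\alpha)\}$, $\hat a_2 = (\hat g^{(3)}_{11}\hat g^{(1)}_{22} - \hat g^{(2)2}_{12})/
\det\{\hat G^{(1)}(\alpha)\}$. For all $\alpha$, $\hat a_2 \ge 0$ by the Cauchy inequality, and, since $\hat\Omega^*(\theta)$
and $\hat G^{(1)}(\alpha)$ are nonnegative definite, $\hat Q(s)$ is nonnegative for all real $s$. It
has a global minimum at $s = -\hat a_1/2\hat a_2$. Thus

$\inf_{|s| \ge 1/\epsilon}\hat Q(s) \ge \Big(1 - \frac{\hat a_1^2}{4\hat a_2}\Big)1\Big(\Big|\frac{\hat a_1}{2\hat a_2}\Big| \ge \frac{1}{\epsilon}\Big) + \Big\{1 + \frac{1}{\epsilon}\Big(\frac{\hat a_2}{\epsilon} - |\hat a_1|\Big)\Big\}1\Big(\Big|\frac{\hat a_1}{2\hat a_2}\Big| < \frac{1}{\epsilon}\Big)$,

where $1(\cdot)$ denotes the indicator function. Thus the probability on the
left-hand side of (7.4) is bounded by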

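(7.13)  $P\Big(\log\Big[1 + \frac{1}{\epsilon}\Big\{\frac{1}{\epsilon}\inf_{\Theta_\alpha}\hat a_2 - \sup_{\Theta_\alpha}|\hat a_1|\Big\}\Big] \le 0\Big) + P\Big(\sup_{\Theta_\alpha}\Big|\frac{\hat a_1}{2\hat a_2}\Big| \ge \frac{1}{\epsilon}\Big)$
        $\le 2P\Big(\sup_{\Theta_\alpha}|\hat a_1 - a_1| + \frac{2}{\epsilon}\sup_{\Theta_\alpha}|\hat a_2 - a_2| \ge \frac{1}{\epsilon}\inf_{\Theta_\alpha}a_2 - \sup_{\Theta_\alpha}|a_1|\Big)$

by elementary inequalities, where $a_1 = (g^{(2)}_{11}g^{(1)}_{22} - 2g^{(1)}_{12}g^{(2)}_{12})/\det G^{(1)}(\alpha)$, $a_2 =
(g^{(3)}_{11}g^{(1)}_{22} - g^{(2)2}_{12})/\det G^{(1)}(\alpha)$. Now $\sup_{\Theta_\alpha}|\hat g^{(i)}_{k\ell} - g^{(i)}_{k\ell}| \to_p 0$ ($i = 2,3$; $k,\ell =
1,2$) as $n \to \infty$, by the same method of proof as described for (7.5), so
$\sup_{\Theta_\alpha}|\hat a_i - a_i| \to_p 0$ ($i = 1,2$) as $n \to \infty$. We need to show that the right-
hand side of the last inequality in (7.13) is positive. It is easily seen that
$\sup_{\Theta_\alpha}|a_1| < \infty$, noting boundedness away from zero on $\Theta_\alpha$ of denominators
in the $g^{(i)}_{k\ell}$. Since $\epsilon$ can be arbitrarily small we require only that $\inf_{\Theta_\alpha}a_2 > 0$.
This is true because, on $\Theta_\alpha$,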



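$g^{(3)}_{11}g^{(1)}_{22} - g^{(2)2}_{12} = \omega_{22}^2\Big[\frac{1}{\{2(\delta_1 - \delta_{02}) + 1\}(2\zeta_2 + 1)} - \frac{\cos^2\gamma}{(\delta_1 + \zeta_2 - \delta_{02} + 1)^2}\Big]$
$\ge \omega_{22}^2\Big[\frac{1}{\{2(\delta_1 - \delta_{02}) + 1\}(2\zeta_2 + 1)} - \frac{1}{(\delta_1 - \delta_{02} + \zeta_2 + 1)^2}\Big]$
$= \frac{\omega_{22}^2(\delta_2 - \delta_1)^2}{\{2(\delta_1 - \delta_{02}) + 1\}(2\zeta_2 + 1)(\delta_1 - \delta_{02} + \zeta_2 + 1)^2} > 0$.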

This completes the proof that $\hat\alpha \to_p \alpha_0$, $\hat\beta = \beta_0 + O_p((m/n)^{\nu_0})$. To replace "O"
by "o" in the latter, for e e (0, 1) define A/"* (e) = M^ie^/^im/nyo) xM^ie^) x 
Msif), ^*{e) = e\M*{e). We have P(^ G^*(e)) < P{9 € f^*{e)nM{e)) + 
P{6 £ ^{e)). We have just shown that the last probability tends to zero. For 
the previous one it suffices to show that as n ■ 



00 



(7.14) 
(7.15) 



Pi inf Uo,ia)<0 

W*(e) 

inf UfsiO) < 



0, 
0, 




where Ma{e) =M^{e) x N&{e), ^*{e) =J\fa{e) \M*{e). The proof of (7.14) 
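is as above. To prove (7.15), following the argument up to (7.13) we have to
show

(7.16)  $P\Big(\sup_{\mathcal{N}_\alpha(\epsilon)}|\hat a_1 - a_1| + 2\epsilon^{1/2}\sup_{\mathcal{N}_\alpha(\epsilon)}|\hat a_2 - a_2| \ge \epsilon^{1/2}\inf_{\mathcal{N}_\alpha(\epsilon)}a_2 - \sup_{\mathcal{N}_\alpha(\epsilon)}|a_1|\Big) \to 0$.

In view of the above remarks about (7.13) it remains to show that the right-
hand side of the inequality in (7.16) is positive. We have

$\sup_{\mathcal{N}_\alpha(\epsilon)}|a_1| \le \Big(\sup_{\mathcal{N}_\alpha(\epsilon)}\big|g^{(2)}_{11}g^{(1)}_{22} - 2g^{(1)}_{12}g^{(2)}_{12}\big|\Big)\Big/\inf_{\mathcal{N}_\alpha(\epsilon)}\det\{G^{(1)}(\alpha)\}$.

The denominator is already known to be positive and the quantity on the
right-hand side whose absolute value is taken equals

$2\omega_{12}\omega_{22}\Big\{\frac{\cos\gamma_0}{(2\zeta_2 + 1)(\zeta_1 + \delta_1 - \delta_{02} + 1)} - \frac{\cos\gamma\cos\tau}{(\zeta_1 + \zeta_2 + 1)(\delta_1 + \zeta_2 - \delta_{02} + 1)}\Big\}$.

After rearrangement and application of trigonometric addition formula, this
is seen to be bounded in absolute value by $K(|\gamma - \gamma_0| + \|\delta - \delta_0\|)$. It follows
that $\sup_{\mathcal{N}_\alpha(\epsilon)}|a_1| \le K\epsilon$. From the proof of Theorem 3, $\epsilon^{1/2}\inf_{\mathcal{N}_\alpha(\epsilon)}a_2 -
\sup_{\mathcal{N}_\alpha(\epsilon)}|a_1| \ge \epsilon^{1/2}/K - K\epsilon$, which, for arbitrarily large $K$, is bounded below by
$\epsilon^{1/2}/2K > 0$, choosing $\epsilon \in (0, (4K^4)^{-1})$.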

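8. Proof of Theorem 4. Define $s(\theta) = (\partial/\partial\theta)R(\theta)$, $S(\theta) = (\partial/\partial\theta')s(\theta)$.
Denote by $\tilde S$ the matrix $S(\theta)$ when its $k$th row is evaluated at $\theta = \bar\theta^{(k)}$. If
$\|\bar\theta^{(k)} - \theta_0\| \le \|\hat\theta - \theta_0\|$, $k = 1, \ldots, 4$, the mean value theorem gives $\hat\theta - \theta_0 =
-\tilde S^{-1}s(\theta_0)$, for some such $\bar\theta^{(k)}$. The theorem is established if

(8.1)  $m^{1/2}\Delta_n^{-1}s(\theta_0) \to_d N_4(0, \Sigma)$,

(8.2)  $\Delta_n^{-1}\tilde S\Delta_n^{-1} \to_p \Sigma$.

Denoting by $\theta_k$, $s_k(\theta)$, the $k$th elements of $\theta$, $s(\theta)$, and by $S_{k\ell}(\theta)$ the $(k,\ell)$th
element of $S(\theta)$,

(8.3)  $s_k(\theta) = \mathrm{tr}\Big\{\frac{\partial\hat\Omega(\theta)}{\partial\theta_k}\hat\Omega(\theta)^{-1}\Big\} - 1(k = 3 \text{ or } 4)\,\frac{2}{m}\sum_j\log|\psi(\lambda_j)|$.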
Now 

dn{9) 
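$k, \ell = 1, 2$, writing $A_j^{(k)} = (\partial/\partial\theta_k)A_j$, $A_j^{(k,\ell)} = (\partial/\partial\theta_\ell)A_j^{(k)}$, $A_j = A(\lambda_j;\theta)$. To
simplify we proceed, as in the proof of Theorem 3, as if $\psi(\lambda) = |\lambda|$. This can
be justified via Assumption B3; further discussion appears later in the proof.
Define $E_{k\ell}$ by replacing the $(k,\ell)$th element by 1 in the $2 \times 2$ matrix of zeros.
Noting that $(\partial/\partial\beta)B = -E_{12}$ we deduce

$A_j^{(1)} = -\lambda_j^{\delta_1-\delta_2}(E_{12}A_je^{i\gamma} + A_jE_{21}e^{-i\gamma})$,
$A_j^{(2)} = iA_jE_{22} - iE_{22}A_j$,
$A_j^{(2+k)} = (\log\lambda_j)(E_{kk}A_j + A_jE_{kk})$, $k = 1,2$,
$A_j^{(1,1)} = 2\lambda_j^{2(\delta_1-\delta_2)}E_{12}A_jE_{21}$,
$A_j^{(1,2)} = i\lambda_j^{\delta_1-\delta_2}(E_{22}A_jE_{21}e^{-i\gamma} - E_{12}A_jE_{22}e^{i\gamma})$,
$A_j^{(1,2+k)} = -(\log\lambda_j)\lambda_j^{\delta_1-\delta_2}(E_{kk}E_{12}A_je^{i\gamma} + E_{kk}A_jE_{21}e^{-i\gamma} + E_{12}A_jE_{kk}e^{i\gamma} + A_jE_{21}E_{kk}e^{-i\gamma})$,
$A_j^{(2,2)} = 2E_{22}A_jE_{22} - E_{22}A_j - A_jE_{22}$,
$A_j^{(2,2+k)} = i(\log\lambda_j)(E_{kk}A_jE_{22} - E_{kk}E_{22}A_j - E_{22}A_jE_{kk} + A_jE_{22}E_{kk})$,
$A_j^{(2+k,2+\ell)} = (\log\lambda_j)^2(E_{kk}E_{\ell\ell}A_j + A_jE_{\ell\ell}E_{kk} + E_{kk}A_jE_{\ell\ell} + E_{\ell\ell}A_jE_{kk})$.

Thus from (8.3), with $A_{0j} = A(\lambda_j;\theta_0)$,

$s_1(\theta_0) = -\mathrm{tr}\Big\{\frac{1}{m}\sum_j\lambda_j^{-\nu_0}(E_{12}A_{0j}e^{i\gamma_0} + A_{0j}E_{21}e^{-i\gamma_0})\hat\Omega(\theta_0)^{-1}\Big\}$,

$s_2(\theta_0) = \mathrm{tr}\Big\{\frac{i}{m}\sum_j(A_{0j}E_{22} - E_{22}A_{0j})\hat\Omega(\theta_0)^{-1}\Big\}$,

$s_{2+k}(\theta_0) = \mathrm{tr}\Big\{\frac{1}{m}\sum_j\Big[\log\lambda_j - \frac{1}{m}\sum_i\log\lambda_i\Big](E_{kk}A_{0j} + A_{0j}E_{kk})\hat\Omega(\theta_0)^{-1}\Big\}$,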

for $k = 1, 2$, where the real part operator is omitted because imaginary parts
are automatically eliminated here, and we use $\hat\Omega(\theta_0) = m^{-1}\sum_j\mathrm{Re}\{A_{0j}\}$. We
can replace, with negligible error, $\hat\Omega(\theta_0)$ by $\Omega_0$ and $A_{0j}$ by $\Gamma_j = PI_{\varepsilon j}P'$ in
$m^{1/2}\Delta_n^{-1}s(\theta_0)$, using arguments of [22, 29, 33], and allowing $p > 2$. Thus
$m^{1/2}\Delta_n^{-1}s(\theta_0)$ differs by $o_p(1)$ from $m^{1/2}\Delta_n^{-1}s^*(\theta_0)$, where $s^*(\theta_0)$ has $k$th
element

(8.5) = ^^^^"^^i ^^{h,} + Uikj Im{/,, }), 

j 

where Umj = -cos-fo{Xp - m-^Y.iK"°)P'^Q^ E12P, Un, = 

- sin 7oA7"°P'Oo 1^12 P, UR2j = 0, Ui2j = P'17o'^22P, UR^2+k,j = (logi - 

m~^X]ilogO ^ P'^o^EkkP-, Ui^2+k,j = 0, for A; = 1,2. After rearrangement 
and application of a martingale CLT we deduce, following the same refer- 
ences, A~^s* /2 N4[0,Tj). [The formula for S can be most easily veri- 
fied after noting that Esls} = 8m~'^ Y.j ^^{URkj{U'ji^j + Umj) + Uikj{Uj(^j - 
Ujij)}, plus a negligible fourth cumulant term.] This completes the proof of 
(8.1). 

Turning to (8.2), it suffices to show that

(8.6)  $\Delta_n^{-1}\{\tilde S - S(\theta_0)\}\Delta_n^{-1} \to_p 0$,

(8.7)  $\Delta_n^{-1}S(\theta_0)\Delta_n^{-1} \to_p \Sigma$.

We omit the straightforward proof of (8.7). To prove (8.6), we require a
rate of convergence for the $\hat\delta_i$. Put $\bar\theta = (\bar\beta, \bar\alpha')' = (\bar\beta, \bar\gamma, \bar\delta_1, \bar\delta_2)'$, for $\|\bar\theta - \theta_0\| \le
\|\hat\theta - \theta_0\|$. For some such $\bar\theta$, $\hat\Omega(\bar\theta)$ appears in all elements of $\tilde S$. From
Section 7 we can write, with the same definitions, and $\Psi$ again replaced by $\Phi$,
$\hat\Omega(\bar\theta) = H(\bar\delta)\{\hat G^{(1)}(\bar\alpha) + b_n(\bar\beta)\hat G^{(2)}(\bar\alpha) + b_n^2(\bar\beta)\hat G^{(3)}(\bar\alpha)\}H(\bar\delta)$. Then from
Theorem 3, $\hat\Omega(\bar\theta) - \hat\Omega(\theta_0) \to_p 0$ if $\bar\delta \to_p \delta_0$ and $\hat G^{(i)}(\bar\alpha) - \hat G^{(i)}(\alpha_0) \to_p 0$, $i = 1, 2, 3$.
To achieve the latter, $H_j = A_{0j}$ can be replaced as before by the $\Gamma_j$, but
from the definitions of Section 7 the $\bar\delta_k$ are involved as exponents of $(j/n)$,
$j = 1, \ldots, m$, in the $\hat G^{(i)}(\bar\alpha)$, so more than the consistency established in
Theorem 3 is needed (though consistency of $\hat\gamma$ suffices). So far as remaining
terms which make up elements of $\tilde S$ are concerned, similar considerations
apply, indeed differentiation produces factors $\log|\psi_j|$, $\log^2|\psi_j|$ in some
summands. In [29], only $\psi(\lambda) = |\lambda|$ was considered, and $\log n$ terms are precisely
eliminated prior to taking limits, as in Section 7. With more general $\psi_j$ this
does not happen, as in [33]'s choice of $\psi_j$, and as there we establish something
a little stronger. It suffices to show that $(\log n)^C(\hat\delta_k - \delta_{0k}) \to_p 0$, $k = 1, 2$, for
any $C < \infty$ [explaining the requirement in (3.7) relating $\log n$ and $m$]. Arguing as
before, this follows if, as 00, sup_;y/-^(^-) ||T((^){G(^) (a) — G^^\a)}T{6)\\ = 
opiilogn)-^^), inf_^_^(^,^^^^(^^logdet{T(<5)G(i)(a)T(5)G(i)(ao)-i} > 0, 
lisin-»oo(logn)2C X inf^^^ 

£/(log„,)C) ^('^) > 0' ^ ^ (O'l)- The first re 

suit follows by straightforward extension of the proof of (4.6) in [29], the 
rate being due to £t now having a finite moment of order greater than 2. 
The proof of the second is identical to that of (7.7), the only difference in 
outcome being the replacement of e by . As in the proof of [29] , we deduce 
the final result from the inequalities in (7.9). 